ABSTRACT

An earlier study of the Kepler Mission noise properties on time scales of primary relevance to 
detection of exoplanet transits found that higher than expected noise followed to a large extent from 
the stars, rather than instrument or data analysis performance. The earlier study over the first six 
quarters of Kepler data is extended to the full four years ultimately comprising the mission. Efforts to 
improve the pipeline data analysis have been successful in reducing noise levels modestly as evidenced 
by smaller values derived from the current data products. The new analyses of noise properties on 
transit time scales show significant changes in the component attributed to instrument and data 
analysis, with essentially no change in the inferred stellar noise. We also extend the analyses to time
scales of several days, instead of several hours, to better sample stellar noise that follows from magnetic
activity. On the longer time scale there is a shift in stellar noise for solar-type stars to smaller values
in comparison to solar values.

Subject headings: methods: observational — stars: activity — stars: late-type — stars: statistics — 
techniques: photometric 


1. INTRODUCTION 

The NASA Kepler Mission has left an indelible imprint
on exoplanet and stellar properties research through its
unmatched combination of photometric precision for a
large number of stars (~150,000), over a long period
of time (4 years) with a standard observing cadence of
30 minutes (Koch et al. 2010). The exquisite time series
returned from Kepler have provided the first results
for Earth-sized planets potentially in or near the habitable
zones of their host stars (e.g., Borucki et al. 2013;
Torres et al. 2015). The standard cadence data have
revolutionized our ability to probe the properties of red
giants with asteroseismology (Bedding et al. 2011), while
the limited short-cadence, 1 minute observations have
similarly revolutionized asteroseismology of dwarf stars
(Chaplin et al. 2011).

While Kepler photometric time series are excellent
compared to anything previously available, they are
not perfect, and one of the early surprises in the Kepler
Mission was a higher than expected noise level in
CDPP, the Combined Differential Photometric Precision
(Christiansen et al. 2012a), a roll-up of all factors of
relevance for detection of exoplanet transits with widths of
3-12 hours. The Kepler Mission had been designed
(Koch et al. 2010) to have roughly comparable noise levels
for fiducial 12th magnitude solar-type stars arising
from irreducible Poisson fluctuations and from intrinsic noise
of the stars, with smaller contributions expected from
imperfections in the instrument and in the software used
to provide extracted and calibrated time series. Kepler
provided the first opportunity to observe stars other than
the Sun at precision levels allowing well informed inferences
about the intrinsic variations of solar-type stars.
The total noise (CDPP at the nominal 6.5 hours) was
found to most commonly be 30 parts per million (ppm)
for 12th magnitude solar-type stars, compared to an
expected 20 ppm (Jenkins 2002). This higher than expected
noise level resulted in the need for twice the data
extent to reach the original mission goals, and was a
prime motivation in seeking to extend the original 3.5
year mission. An extended mission was approved to double
the original extent; however, the loss of two (of four)
reaction wheels brought the prime mission to an end after
rather precisely 4 years of observing. Analyses by
Gilliland et al. (2011) showed that the primary factor in
increased CDPP was the contribution from stars, with
a smaller addition from imperfections of instrument and
software.

Several studies have addressed general stellar variability
with Kepler data. Ciardi et al. (2011) presented
an overview of variability from the first month of data
over most stellar types. McQuillan, Aigrain & Roberts
(2012) and Roberts et al. (2013) also analyzed the
first month. Basri, Walkowicz & Reiners (2013) and
Walkowicz & Basri (2013) used one quarter of data to
focus on a multi-time scale consideration of solar-type
stellar variability, concluding that the Kepler stellar sample
tended to be quieter than the average Sun, a result at
mild variance with the conclusions of Gilliland et al. (2011)
(hereinafter Paper 1) and McQuillan, Aigrain & Roberts (2012).
A primary critique by Basri, Walkowicz & Reiners
(2013) (hereinafter BWR13), with undeniable validity,
was that CDPP at 6.5 hours, of prime relevance for exoplanet
transit detection, is not an optimal choice for study
of stellar variability, where longer time scales of several
days would better elucidate behavior following from magnetic
activity and rotation of solar-type stars.

In this paper we revisit the Paper 1 analyses with two
primary considerations. First, how do the original conclusions
regarding noise sources relevant to the detection
of exoplanet transits change with the consideration of
data over 4 years, rather than the 1.25 years originally
used, and with use of data from a more mature data
processing pipeline providing the time series? That will be
the topic of Section 3. Second, the topic of Section 4 will
be a consideration of noise for solar-type stars following
adoption of metrics on a longer time scale of greater
relevance to primary evidence of magnetic activity induced
changes. Section 5 provides results on simulating the
expected distribution of this longer timescale variability
metric.

2. KEPLER OBSERVATIONS, DATA RELEASES, AND 
PRIMARY NOISE METRIC 

Paper 1 provided an extensive discussion of how the
Kepler photometer operated, the selection of targets
relevant to exoplanet detection, and hence the focus of the
noise source study. Also considered in detail were the
primary noise terms expected to be important for CDPP:
Poisson noise from stars, Poisson noise from sky background,
readout noise, instrument and/or software imperfections,
and intrinsic variability of the stars scaled
from solar observations. Rather than attempting to condense
an original three page discussion setting the stage for our
primary study of noise contributions, we refer the interested
reader to Section 2 of Paper 1.

The data considered in this paper follow from three
epochs: 1) As in Paper 1, the original release of Quarters
2 through 6 in 2009 to 2010. Quarter 2 was re-released in
the middle of this epoch, bringing the treatment of all five
quarters to a roughly consistent level. 2) Quarters Q0 -
Q14 as uniformly reprocessed in early 2013. Quarters 15
- 17 were released at a similar level of software shortly
after this. 3) All quarters as uniformly reprocessed in
late 2014.

The early releases of data within three months of having
been telemetered to the ground used the Science
Operations Center (SOC) Pipeline 6, with 6.3 being
representative of the Quarters 2-6 data as analyzed in Paper
1. Removal of instrumental systematics, the key step in
producing calibrated Kepler time series, was handled via
a least squares regression with basis vectors associated
with pointing records, temperature records, and inferred
telescope focus values. For the early data releases the
calibrated data generated by the Presearch Data Conditioning
(PDC) module, for which systematics have been removed,
were referred to as ap_corr_flux in the FITS files.
Details of processing may be found in the Data Release
Notes applicable to Quarter 5 as a representative case
(Machalek et al. 2010), the Kepler Data Characteristics
Handbook (Christiansen et al. 2012b), and Jenkins et al.
(2010). While this early software did a good job of
removing instrumental systematics, inspection of light
curves (as discussed in the Data Release Notes) would
sometimes show clear evidence of spurious signals being
introduced, as well as frequent removal of likely real
stellar variability.

To address the common suppression of stellar signals a
Bayesian approach to PDC was introduced in Kepler SOC
version 8.0. This Bayesian maximum a posteriori (MAP)
approach to cotrending (Stumpe et al. 2012; Smith et al.
2012) more effectively removed common mode instrumental
systematics while preserving stellar signals. This
earliest version of PDC-MAP was first applied to Quarter
9 data, as then used in BWR13.

The 2013 data releases were the first time that a uniform
reprocessing for the bulk of Kepler mission data
was performed. This used the SOC Pipeline 8.3. The
primary change for this data release is that PDC uses
wavelet decomposition and multiple temporal scales in
performing the MAP processing. It decomposes each
light curve into three characteristic bands, thus improving
the ability to deal with instrumental systematics,
while still preserving intrinsic stellar signals at short to
moderate (~20 days) timescales. The longest band (>21
days) performs a simple robust fit to cotrending basis
vectors evaluated for this temporal band. Stellar signals
at timescales significantly longer than this may be
severely suppressed. The middle band of 2 hours to 21
days performs a MAP fit. The shortest band preserves
all signals, i.e. no detrending is performed. The software
evaluates on a star-by-star basis whether to invoke the
multi-scale MAP (msMAP) or, if a goodness metric
calculated by PDC indicates that regular MAP performs
better, to use the latter to provide the calibrated time series
(PDCSAP_FLUX) in the FITS file. About 90% of the time
msMAP is adopted. Details of this processing may be
found in Smith et al. (2012) and Stumpe et al. (2012),
with an update in Stumpe et al. (2014). Quarters 15 -
17 were processed by slightly later versions of the SOC
Pipeline, but the changes were generally not such as to
fundamentally affect noise characteristics.

The third epoch of data releases in late 2014 considered
here has been the only time that all Kepler data were
processed consistently with the same version of the Kepler
pipeline. The large change of msMAP introduced for SOC 8.3
was retained. The primary advances for this
newest data release were improvements to the lower-level
treatment of data at the pixel level, e.g. a more advanced
consideration of overscan in order to better deal with
some of the more serious sources of instrumental systematics
at a root level. For details see Thompson et al.
(2015). This processing used SOC Pipeline 9.2.

3. STELLAR AND INSTRUMENTAL NOISE 
DECOMPOSITION 

3.1. Summary of Original (and Current) Approach 

To facilitate determining the relative importance and
quantitative values of several terms contributing to
CDPP we focused on a study of a subset of the full Kepler
sample expected to have comparable contributions
from the primary terms of simple Poisson fluctuations,
intrinsic stellar variability, and instrument/software
imperfections. By design of the mission (Koch et al. 2010)
this led us to focus on stars of roughly solar type, and
Kepler magnitude, Kp (Brown et al. 2011), of 12.0 ± 0.5.

We directly modelled contributions of noise from Poisson
terms on the stellar and sky fluxes, as well as the
known CCD readout noise (Christiansen et al. 2012b)
for each Kepler CCD, and removed these before attempting
to separate out stellar variability and
instrument/software terms.

Kepler observations were conducted on the same stellar
field, with primarily the same targets throughout the
prime mission. Four times during each Kepler orbit of
the Sun, the spacecraft was reoriented by 90 degrees
(Van Cleve & Caldwell 2009) in order to keep the solar
panels illuminated, and the spacecraft radiator in shade. The
progressive reorientation results in sets of stars cycling
through four (of 84 total) CCD channels, thus providing
the primary leverage used to disentangle instrument
and intrinsic stellar contributions. Considered as an ensemble,
if the noise of one set of stars changes as they
cycle through 4 CCD channels, then this demonstrates
that the electronics associated with those channels contribute
different levels of noise. Through adoption of a
Singular Value Decomposition (SVD) formalism we obtained
noise terms in time (a global value associated with
each quarter, as might follow from unique operation of
the instrument, or external factors such as solar particle
fluence), space (the individual CCD channels), and
for the stars. The SVD formalism follows the discussion
in, and uses subroutines from, Press et al. (1992) for the
solution of a highly over-determined (more observables
than unknowns) set of general linear least-squares equations
with degeneracies present. A key assumption was
that ensembles of stars nearby on the sky should have the
same intrinsic variability, thus allowing us to put the
independently determined relation of quartets of channels
on a common scale.
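
As a rough illustration of the decomposition described above, the
following Python sketch fits an additive model in which each observed
variance is the sum of a per-star, a per-quarter, and a per-channel term.
It is not the Paper 1 code. It assumes numpy.linalg.lstsq (itself
SVD-based, returning the minimum-norm solution of a rank-deficient system)
in place of the Press et al. routines, a single hypothetical quartet of
four channels and four sky regions, and a simple gauge choice (quietest
quarter and quietest channel set to zero) to remove the degeneracies. The
function name decompose_variance and the region-to-channel mapping are
illustrative assumptions.

```python
import numpy as np


def decompose_variance(obs, region, n_channels=4):
    """Split obs[s, q] into star + quarter + channel terms (illustrative).

    obs    : (n_stars, n_quarters) array of excess variance, e.g. in ppm^2
    region : length n_stars integer array, sky region (0..n_channels-1)
    Assumed mapping: a star in region j falls on channel (j + q) % n_channels
    in quarter q, mimicking the quarterly 90 degree rolls.
    """
    obs = np.asarray(obs, dtype=float)
    region = np.asarray(region, dtype=int)
    n_stars, n_quarters = obs.shape
    n_par = n_stars + n_quarters + n_channels

    # One linear equation per (star, quarter) observation.
    design = np.zeros((n_stars * n_quarters, n_par))
    rhs = np.zeros(n_stars * n_quarters)
    row = 0
    for s in range(n_stars):
        for q in range(n_quarters):
            chan = (region[s] + q) % n_channels
            design[row, s] = 1.0
            design[row, n_stars + q] = 1.0
            design[row, n_stars + n_quarters + chan] = 1.0
            rhs[row] = obs[s, q]
            row += 1

    # lstsq uses an SVD internally and copes with the rank deficiency.
    sol, *_ = np.linalg.lstsq(design, rhs, rcond=None)
    star = sol[:n_stars]
    quarter = sol[n_stars:n_stars + n_quarters]
    channel = sol[n_stars + n_quarters:]

    # Fix the gauge: quietest quarter and quietest channel are set to zero,
    # with the removed offsets absorbed into the stellar terms.
    dq, dc = quarter.min(), channel.min()
    return star + dq + dc, quarter - dq, channel - dc
```

With several quartets, each quartet carries its own free offset between
stars and channels; the text removes this by requiring a common ensemble
stellar value, which this single-quartet sketch does not attempt.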

The original study considered a number of factors such
as dependence of stellar noise on galactic latitude, crowding
of sources, and the influence of fainter, superposed
background stars. These proved to be of second order
and will not be considered here. We refer interested
readers to Sections 3.1 through 3.8 of Paper 1 for a full
discussion of our approach. In the remainder of this section
we focus on results applying the SVD formalism as
before to updated data products, and the use of all 17
quarters of data instead of the original 2-6.

Since four years have passed since the original analysis
was performed, we started by locating the original codes,
recompiling, and attempting to replicate the sequential
analyses of the original study, using the same data products
as in the 2011 study. This was successful in that
new analyses of the original data resulted in exactly the
results quoted in Tables 1, 2 and 4, and shown in Figure
8 of Paper 1 giving primary noise separation values.

3.2. Repeat of Original Updated to New Data Products 

The 2011 study used time series produced within three
months of the end of each quarter, the last one analyzed
(Q6) having been written in December 2010. There have
since been two primary releases in which most, or all, of
the prime mission data were reanalyzed with more mature
software at the Science Operations Center for Kepler.
We will provide results separately for the processing
version 8.3 data released over April through December
2013 for all 17 quarters, and processing version 9.2
released over November through December 2014 for all the
data.

Minor software adjustments needed to be made to accommodate
the newer FITS formats of the 8.3 and 9.2 data
sets, as well as minor modifications in a few cases for
date ranges provided in individual quarters. With the
exception of such details, we have performed analyses in
exactly the same way as in Paper 1.


Table 1

Quarter-to-quarter excess variance.

Version      Q2       Q3       Q4       Q5       Q6
6.3        210.46   105.82    44.52     0.00    29.89
8.3         62.35     0.00    15.02    23.15     8.64
9.2         18.23     0.00     3.01    28.46     0.04

Note. Variances in ppm^2 over the five quarters of Kepler data
analyzed. Variance of quietest quarter is forced to zero. Version
refers to SOC Pipeline version number used.


Adoption of the new data products led to rather dramatic
shifts in the noise levels attributed to individual
CCD channels (or imperfections in the pipeline software
used to analyze them), as well as dramatic shifts in the
noise levels attributed to each individual quarter in a
global sense. This was initially a cause for concern, that
perhaps the analyses were either inherently unstable, or
inadequately executed. The linear correlation of variances
inferred per channel between the original study
(see Paper 1, Table 2) and the new one using updated
data products was only ~0.5. However, the correlation of
the inferred intrinsic stellar variations between the original
data products for Q2-6 and the newer versions of
the same data came in at greater than 0.97. The stars
of course had intrinsically the same behavior independent
of how the data were analyzed to remove various
systematic effects from the time series. The SVD procedure
successfully returned nearly identical behavior for
the stars, while showing different and generally smaller
noise levels in time and across the detector channels for
the more recently processed data.

Table 1 shows the assigned quarter-to-quarter excess
variance for the first five full quarters, with the first line
being from Paper 1. In successive full data releases 8.3
and 9.2 the variance (square of noise) drops dramatically
for quarters 2 and 3, which had been most affected
by systematics. This behavior was expected since most
pipeline development after the mission start was devoted
to dealing with and suppressing systematics arising from
imperfection in detector electronics and operational and
environmental variations.

Over the three data release versions shown in Table 1
the global variance attributed to intrinsic variations of
the stars was held fixed, and as noted above the star-to-star
variances were reproduced at a very high level
of fidelity across these. The mean excess variance over
Quarters 2-6 is 78.1, 21.8 and 9.9 ppm^2 over data processing
releases using SOC 6.3, 8.3 and 9.2 respectively.
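
The quoted means follow directly from the rows of Table 1. A short
Python check, with the values transcribed from the table and the helper
name mean_excess chosen here only for illustration:

```python
# Excess variances (ppm^2) for Quarters 2-6, transcribed from Table 1.
table1 = {
    "6.3": [210.46, 105.82, 44.52, 0.00, 29.89],
    "8.3": [62.35, 0.00, 15.02, 23.15, 8.64],
    "9.2": [18.23, 0.00, 3.01, 28.46, 0.04],
}


def mean_excess(values):
    """Arithmetic mean of the per-quarter excess variances."""
    return sum(values) / len(values)


means = {version: mean_excess(vals) for version, vals in table1.items()}
for version, value in means.items():
    print(f"SOC {version}: mean excess variance = {value:.1f} ppm^2")
# SOC 6.3: mean excess variance = 78.1 ppm^2
# SOC 8.3: mean excess variance = 21.8 ppm^2
# SOC 9.2: mean excess variance = 9.9 ppm^2
```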

We defer showing the individual contributions per
channel as in Table 2, or Figure 12 of the original study
until the next section when data from the full mission
are used to set this. With the mean stellar and Poisson
contributions held fixed, it is worth noting that the mean
excess variance from Table 1 plus the per-channel
excesses drops from 181 ppm^2 in the data releases made
within three months of each quarter end, to 137 ppm^2 for
release 8.3, and finally to 98 ppm^2 for release 9.2. Since
the sum of stellar and Poisson terms is 664 ppm^2, the
component of noise attributable to imperfections in the
detector electronics and the inability of detrending software
to perfectly compensate has become an increasingly
minor contributor to the overall noise budget, reflecting
positive changes in the pipeline software producing corrected
time series.
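
To put these variances in perspective, the sketch below computes the
fraction of the total variance carried by the excess term and the
corresponding total rms. It assumes the total is simply the sum of the
664 ppm^2 stellar plus Poisson term and the quoted excess, which is an
assumption made here for illustration; the function name budget is
likewise hypothetical. Under that assumption the excess fraction falls
from roughly 21% to 17% to 13%, and the total rms from about 29.1 to
28.3 to 27.6 ppm.

```python
import math

STELLAR_PLUS_POISSON = 664.0  # ppm^2, as quoted in the text

# Mean excess variance (ppm^2) quoted for each set of data releases.
excess = {"early releases": 181.0, "8.3": 137.0, "9.2": 98.0}


def budget(excess_var, base_var=STELLAR_PLUS_POISSON):
    """Return (excess fraction of total variance, total rms in ppm)."""
    total = base_var + excess_var
    return excess_var / total, math.sqrt(total)


for release, var in excess.items():
    frac, rms = budget(var)
    print(f"{release}: excess fraction = {frac:.1%}, total rms = {rms:.1f} ppm")
```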

3.3. Extension from Quarters 2-6 to full Quarters 1-17 

We have performed full mission analyses for data releases
8.3 and 9.2. The SVD analysis procedures remain
unchanged, but rather than having a nearly minimal
solution basis in which each quartet of stars visited most
detector channels only once (with redundancy of quarters
2 and 6), there is now a four-fold redundancy with each
set of stars cycling through the same detector multiple
times. We carry the same assumption as before, namely
that the ensemble properties of the stars remain fixed in
time, and to first order in space as well. Over a four
year time span some individual stars are likely to have
shown significant evolution of intrinsic noise within the
three-month quarterly intervals; certainly the Sun shows
significant variations in going from minimum to maximum
conditions. The SVD solution relies on having an
average of 116 stars per quartet, i.e. the individual sets
cycling through the detector channels, and it is a reasonable
assumption that stellar cycle variations are not synchronized
and the ensemble of 116 stars remains sensibly
fixed. The intrinsic stellar variance star-to-star derived
from Quarters 2-6 has a linear correlation of 0.959 with
the same as derived from Quarters 1-17, thus the evolution
of intrinsic noise level for individual stars is shown
to be modest (for the two sets of 9.2 data).

The quarter-to-quarter global excesses are shown in
Table 2. The quietest quarter over 2-16 (the full length
quarters) is forced to zero within the SVD solution, and
this happens to be Quarter 9 for both the SOC 8.3 and
9.2 data releases. The small value shown for Quarter 17
is likely an artifact of this being only about one month
long. The SOC pipeline detrending removes signal on
shorter time scales for this shorter than normal quarter.
The means of changes over time in the two independent
pipeline processing cases are modest: 50.5 ppm^2 on average
at 8.3, and 46.6 ppm^2 for data release 9.2. The
changes across time are generally well understood. High
values for quarters 1 and 2 result from a break-in period
of less than optimal management of Kepler, e.g.
the presence of variable guide stars removed for later cycles,
and multiple safings and repointings in Quarter 2.
Higher values later in the mission, Quarter 12 in particular,
phase well with measures of solar activity indicative
of increased particle fluxes encountered by Kepler
as the Sun transitioned to solar maximum activity. A
proxy for what Kepler will have experienced is given by
the Planetary Ap index (Siebert & Meyer 1971) in Table
2. This is an average over measurements of disturbance
levels in two horizontal field components observed at 13
selected, subauroral stations. Since Kepler was offset by
as much as 0.4 AU from the Earth at the end of mission,
an Earth-based metric is only a rough indication
of the environment at Kepler. These results were taken
from http://www.solen.info/solar. Other solar activity
indicators such as sunspot number, 10.7 cm flux, or flare
counts also show rising trends with time and a good correspondence
with the rise in Kepler noise in later quarters.
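
One simple way to quantify how well the excess variance phases with the
Ap index is a Pearson correlation over the Table 2 columns. The sketch
below is only illustrative: restricting to Quarters 3-16 (dropping the
break-in Quarters 1-2 and the short Quarter 17) is a choice made here,
not one stated in the text, and with only 14 points the coefficient is
poorly constrained.

```python
import numpy as np

# Quarters 3-16, transcribed from Table 2: release 9.2 excess variance
# (ppm^2) and the Planetary Ap index for the same quarters.
excess_92 = np.array([9.67, 23.85, 49.47, 16.14, 32.46, 72.19, 0.0,
                      20.41, 68.50, 127.96, 63.78, 62.93, 44.11, 48.44])
ap_index = np.array([2.79, 3.48, 8.32, 7.04, 5.00, 6.65, 8.87,
                     9.68, 5.23, 11.17, 8.90, 10.07, 6.12, 7.53])

# Pearson correlation coefficient between the two quarterly series.
r = np.corrcoef(excess_92, ap_index)[0, 1]
print(f"Pearson r (Q3-Q16, release 9.2 vs Ap) = {r:.2f}")
```

A positive coefficient would be consistent with the correspondence
described above, though it cannot by itself separate solar particle
effects from other trends over the mission.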

The next step in obtaining a separation of error terms
between the instrument (or residual inability of pipeline
software to remove the results of instrumental imperfections)
and stars is to solve for instrumental terms within


Table 2

Quarter-to-quarter excess variance.

Quarter      8.3       9.2      Ap
 1         219.06    232.03    4.53
 2          95.07     58.55    5.43
 3           3.25      9.67    2.79
 4          31.65     23.85    3.48
 5          36.36     49.47    8.32
 6          20.44     16.14    7.04
 7          23.84     32.46    5.00
 8          72.68     72.19    6.65
 9           0.0       0.0     8.87
10          23.56     20.41    9.68
11          55.49     68.50    5.23
12         125.57    127.96   11.17
13          57.53     63.78    8.90
14          78.85     62.93   10.07
15          78.33     44.11    6.12
16          52.92     48.44    7.53
17         -39.65    -42.62    6.85

Note. Variances in ppm^2 over all 17 quarters of Kepler data
analyzed. The two columns are for primary data processing release
8.3 (mid-2013), and 9.2 (late-2014). Ap is the Planetary A index
(Siebert & Meyer 1971).

each quartet of channels, while at the same time solving
for the intrinsic variance of each star. The quartets
are then placed on a common scale by requiring that
the ensemble average of stars within each quartet have a
common value. Figure 1 shows how the new by-channel
variances compare to those found in Paper 1. For 52 of 84
channels (62%) the variance ascribed to the instrument
has dropped. In the original study the channels having
poorer focus showed higher variance, with a linear correlation
coefficient of -0.63 between variance and focus. In
the new set of by-channel variances using the 9.2 data
release and all data as input, this correlation drops to
-0.34. The correlation of excess noise with poor focus
is still noticeable; however, it has been reduced significantly
in amplitude.

3.4. Summary of Changes Using Newer and More Data 

In repeating the original noise study (Paper 1) using
the current data release (following four years of software
development for the pipeline), and all four years
of data, we have found generally expected results. The
noise levels attributed to the individual solar-type stars
have changed very little with adoption of the newer data
release, a gratifying result since pipeline updates cannot
have affected the stars. Figure 2 shows the updated version
of Figure 8 from Paper 1, the inferred intrinsic stellar
noise, now based on all quarters with use of up-to-date
pipeline processing inputs. Only differences of minor detail
can be noted with respect to the original. The noise
levels inferred for individual channels on the instrument
have dropped with the inclusion of more, and most significantly
more recently processed, data. The software
developments within the pipeline were of course motivated
in large part to reduce the excess noise attributed
to the instrument. The fraction of variance attributed to
factors potentially under the control of software development
has dropped from 22% four years ago, to 13% now.
This is of course an over-simplified view. The importance
of changes for various applications depends not only on a
gross measure of noise level, but also on detailed characteristics
of residual noise. Similarly the intrinsic stellar
noise may be amenable to suppression for some applications.
Nonetheless a consistent picture has developed of
considerable improvement in the pipeline-calibrated data
products over time.

Figure 1. By-channel intrinsic variance levels are plotted against
by-channel focus as represented by the fraction of total energy in
the central pixel for a star centered on a pixel. Channels overplotted
with a small circle represent nine cases independently identified
to have moderate Moire pattern noise, and the ten cases with
strong Moire noise have doubled circles added. Values from the
original study, and the new 9.2 data release based on all quarters
are both plotted, connected with a green line when the new solution
has smaller variance, red when larger.

4. USE OF LONGER TIMESCALE NOISE METRIC 

CDPP was designed to capture those components of
noise and intrinsic stellar variability of greatest relevance
to the detection of low-amplitude exoplanet transits having
characteristic time scales of 3 to 12 hours. Such a
metric need not be, and indeed is not, an optimal one for
other studies such as determining the intrinsic variability
of solar-type stars. The CDPP metric depends on the low
frequency tail of variability resulting from stellar granulation,
and only the high frequency tail of variability resulting
from magnetic activity induced variations. If interested
in the stars it would be better to consider multiple
metrics that individually capture the primary sources of
variability. The “flicker”, or root mean square variation
of stars on timescales shorter than 8 hours (Bastien et al.
2013), has been useful for characterizing variability at
high frequencies, with resulting ability to measure stellar
gravities, as improved by Kallinger et al. (2014). At
longer timescales the measure of intrinsic stellar behavior
is more difficult given the likelihood of contamination
from systematics in the Kepler data. Once a month Kepler
suspended science operations to re-point the fixed
high-gain antenna toward the Earth to telemeter accumulated
data to the ground. This resulted in thermal
perturbations to the telescope and photometer, introducing
photometric changes large compared to the stellar
variations of quiet solar-type stars. To deal with these
systematics, detrending was introduced that very successfully
removed many common mode variations from the
instrumental drifts, but at an additional cost of suppressing
true stellar signals in some regimes and introducing
uncertainty in the final product. Given the roughly
month-long rotation period for quiet solar-type stars, and
the monthly cadence of Kepler pointings, recovery of intrinsic
stellar signals on timescales of several days, most
useful for characterization of activity variations, was thus
made challenging.

Figure 2. Upper panel shows the intrinsic stellar noise in ppm
for the Kp = 11.5 to 12.5 sample as a function of galactic latitude.
Medians evaluated up to 100 ppm are shown as ‘o’, while means
from up to 3 x the median at each degree of galactic latitude are
shown as ‘+’ symbols. Standard errors for the means are shown.
The lower panel shows a histogram of number of stars per ppm
bin. The mean and rms distribution for solar noise levels over
Quarter-long intervals spanning a solar Cycle are shown by the ‘+’
and heavy horizontal line, with the full extent of solar noise per
Quarter the thin line.

BWR13 have used two primary metrics to encapsulate
stellar variations on activity timescales. These are physically
well motivated, and useful for characterizing stellar
activity variations. Caution, however, is due in application
to Kepler data, where the pipeline calibration of
data may well suppress some variations of relevance to
forming these statistics. The first diagnostic, Rvar, was
introduced by Basri et al. (2011). This “range” parameter
is found by sorting all the photometric data points in
a given interval (30 days generally adopted), then taking
the difference between the 5% and 95% points (to
avoid anomalous excursions) in this distribution. The resulting
statistic will be sensitive to both short timescale
excursions (if lasting more than 5% of the time during
30 days) as might follow from star spots, and more
generally to longer timescale variations approaching the
monthly intervals adopted. Since the Kepler systematic
noise removal process (especially the latest msMAP version)
strongly suppresses noise on timescales as short
as 30 days, we consider the Rvar parameter problematic
for interpretation of Kepler data, especially when
trying to obtain an absolute comparison placing the
Sun within the distribution of stars assessed with Kepler.
The second primary measure of variability introduced in
BWR13 is the median differential variability MDV(t_bin).
The MDV measures the variability by forming bins of
length t_bin, then taking the absolute difference between
adjacent bins. The MDV follows as the median value
of the time series of absolute differences. We have chosen
to focus on MDV with t_bin = 8 days. Figure 8 of
BWR13 compares the 8-day MDV for the Sun from SOHO
data (Fröhlich et al. 1997), considering 30 day blocks over
a full solar cycle, to an ensemble of bright Kepler stars
for one quarter (Q9) of data. In the comparison given
in their Figure 8 a significant fraction of the solar points
have MDV values well below the overall minimum value
reached for about 1,000 Kepler stars - an implausible
result suggesting that either something went wrong with
evaluating the solar or Kepler values, or that the Kepler
time series have significant residuals on time scales
longer than 8 days creating this offset.
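
The two statistics described above can be sketched in a few lines of
Python. This is a minimal illustration following the verbal definitions
in the text, not the BWR13 implementation: the function names r_var and
mdv, the use of the mean flux to represent each bin, and the handling of
empty or partial bins are all assumptions made here.

```python
import numpy as np


def r_var(flux, lower=5.0, upper=95.0):
    """Range between the lower and upper percentiles of the flux values."""
    lo, hi = np.percentile(np.asarray(flux, dtype=float), [lower, upper])
    return hi - lo


def mdv(time, flux, t_bin):
    """Median absolute difference between adjacent bins of length t_bin.

    time and t_bin share the same units (e.g. days); time must be ascending.
    Each bin is represented by its mean flux (an assumption); empty bins are
    set to NaN so that only truly adjacent bins are differenced.
    """
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    idx = np.floor((time - time[0]) / t_bin).astype(int)
    counts = np.bincount(idx)
    sums = np.bincount(idx, weights=flux)
    means = np.full(counts.size, np.nan)
    means[counts > 0] = sums[counts > 0] / counts[counts > 0]
    return np.nanmedian(np.abs(np.diff(means)))
```

For example, a purely linear ramp of one flux unit per day sampled daily
over 32 days gives mdv(time, flux, 8.0) equal to 8, the offset between
consecutive 8-day bin means. Note that a trailing partial bin is kept
here, which a more careful treatment might discard.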

In order to further pursue a comparison of Kepler stars
with the Sun in the 8-day MDV statistic, given the surprising
result of many solar values disjoint from the extrema
of 1,000 Kepler stars, we have formed our own metrics.
Figure 3 shows the distribution of solar values for
30 3-month long intervals of SOHO (Fröhlich et al. 1997)
data, the same set as discussed in Paper 1, and the 4,529
solar-type Kepler stars with Kp < 12.5. Differences with
respect to the BWR13 study include: (1) We compute
8-day blocks over 90 days (or length of Kepler quarter of
data), rather than within 30 day intervals to form MDV.
(2) We take the median over all 17 quarters of Kepler
data, rather than adopting Quarter 9. (3) We use the
latest available (9.2) data release. None of these differences
are significant. Our distribution of 8-day MDV
values for the Sun compared to Kepler stars is radically
different than that in Figure 8 of BWR13. In particular
the distribution of MDV values for the Sun falls
within the extent of stellar values from a large ensemble
of stars. The radically different distribution follows
primarily from our values for the solar MDV. Our minimum
solar MDV is about 0.04 ppt, while the BWR13
value is about 0.0015 ppt. We differ by over an order of
magnitude in scale for the solar MDV at 8 days.

In order to pursue the latter discrepancy we have 
compared records of solar variations used, with the 
time series used in BWR13 kindly provided by G. 
Basri. The latter authors used the mean of “green” and 
“red” VIRGO (SPM) data from SOHO, starting with 
hourly cadence data linearly interpolated to half-hour to 
roughly match the Kepler cadence. Paper 1 also used 
VIRGO/SOHO data, but started with a compilation at 
60 second intervals and binned this to 29.4 minutes. We 
also adopted just the “green” channel and scaled this by 
0.79 to adjust amplitudes to the longer average wavelength
of Kepler. Figure 4 shows a representative 0.5
year interval between the adopted solar records of the 
two studies. A detailed comparison shows numerous dif¬ 
ferences, but these are at the level of influencing 8-day 
metrics at the 10% level, not the nearly factor of 20 found 
in our two sets of 8-day MDV metrics for the Sun. Our 
difference from BWR13 for the solar 8-day MDV does 
not follow from minor differences in color or sampling 
for adopted solar records. 
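
The two preparations of the solar record described above can be sketched as follows. This is an illustrative reading, not either study's code: the function names are hypothetical, the 0.79 factor is assumed to apply to mean-subtracted relative variations, and bins are assumed to be simple averages over consecutive 29.4 minute windows.

```python
import numpy as np


def bin_to_cadence(time_days, flux, cadence_min=29.4):
    """Average samples into consecutive time bins of width cadence_min."""
    time_days = np.asarray(time_days, dtype=float)
    flux = np.asarray(flux, dtype=float)
    width = cadence_min / (60.0 * 24.0)
    idx = np.floor((time_days - time_days[0]) / width).astype(int)
    counts = np.bincount(idx)
    sums = np.bincount(idx, weights=flux)
    good = counts > 0  # skip bins that received no samples
    centers = time_days[0] + (np.nonzero(good)[0] + 0.5) * width
    return centers, sums[good] / counts[good]


def green_to_kepler(rel_flux, scale=0.79):
    """Scale variations about the mean of the green channel (assumed form)."""
    rel_flux = np.asarray(rel_flux, dtype=float)
    return scale * (rel_flux - rel_flux.mean())


def interpolate_hourly_to_half_hour(time_days, flux):
    """Linear interpolation of an hourly record onto a half-hour grid."""
    time_days = np.asarray(time_days, dtype=float)
    step = 0.5 / 24.0
    new_time = np.arange(time_days[0], time_days[-1] + 1e-9, step)
    return new_time, np.interp(new_time, time_days, flux)
```

Averaging many 60 second samples suppresses high-frequency noise that linear interpolation of hourly data cannot restore or remove, which is one reason the two records differ in detail while agreeing on multi-day timescales.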

Ironically, our comparison of solar and Kepler 8-day
MDV better supports a primary contention of BWR13,
that the relative noise levels intrinsic to the stars compared
to the Sun are lower than concluded in earlier studies
of Paper 1 and McQuillan, Aigrain & Roberts (2012),
than does their own result for this metric. We defer
further discussion of typical activity levels of solar-type
Kepler stars relative to the Sun until after discussion of
results from our favored long timescale noise metric.

Figure 3. Black dots show the median values over all 17 quarters
for the 4,529 Kepler solar-type dwarfs for the BWR13 8-day MDV
and Range metrics as computed in this study. The red crosses
show the same statistics for 30 90-day intervals of solar time series
spanning a full solar cycle.

[Figure 4 plot: horizontal axis is Time (years), 0.0 to 0.5.]

Figure 4. Upper panel shows a 0.5 year interval of the solar
activity record adopted by BWR13. The lower panel shows the
solar record for the same interval as used both in Paper 1 and this
study.

4.1. Adoption of a CDPP-style Metric with Longer 
Timescale 

The CDPP metric of Paper 1 and in Section 3 above
starts with the calibrated (instrumental signatures re¬ 
moved to the extent possible) pipeline data, removes a 
running 2-day quadratic polynomial fit to the time se¬ 
ries, block averages into 6.5 hour intervals, then eval¬ 
uates the standard deviation for quarter-long segments. 
Figure 5. Light curves show filter components from the 24 day
quadratic polynomial filtering (zero response at zero frequency), 
and sinc function representation of the 3.25 day binning adopted
for the long timescale CDPP. The bold curve shows the adopted 
net response function plotted against frequency with 50% transfer 
periods of about 8 and 15 days flagged. 

We choose here to adopt exactly the same procedure, but
now use timescales longer by ×12. We start with a 24-day
quadratic polynomial fit that will preserve signal at much
longer intervals than the standard 6.5 hour CDPP, then
follow this with binning into 3.25 day intervals before
evaluating the standard deviation. Figure 5 shows the
response function of our filtering and binning operations
for this long timescale CDPP. The 50% transfer points
are at about 8 and 15 days. Aigrain, Favata & Gilmore
(2004) found that a timescale of 9.8 days best characterized
solar activity variations, rather than the rotation period
of ~26 days. Our metric nicely spans the timescale of
9.8 days. We have also directly verified that the 8 - 15
day bandpass represents solar variations at high
fidelity by evaluating it for 30 “quarters” of SOHO/VIRGO
(Fröhlich et al. 1997) data spanning a full solar cycle.
The 8 - 15 day bandpass correlates at the 85% level with
an 8 - 30 day bandpass measure. Setting the upper range of
the metric at 15 days avoids most of the damping inherent
in the calibrated data: signals at 30 days would be
largely removed, and those at 20 days would be uniquely
perturbed star-to-star and quarter-to-quarter. By design
this timescale was chosen to be as long as possible without
the long timescale end already having been significantly
suppressed by instrumental systematics removal
in the Kepler pipeline processing. Figure 6 illustrates
the damping introduced by the pipeline version (“regular
MAP”, data release 8.0) used in BWR13, as well as the
data products more recently available (“msMAP”, data
release 9.2). This figure supports the selection of an
8 - 15 day bandpass filter for our primary long-timescale
metric. This longer timescale CDPP would no longer be
relevant to the detection of 3 - 12 hour transits, but is
well suited to attempting to characterize activity induced
variations in a sample of solar-like stars.
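
The sequence of steps described above (running quadratic removal, block averaging, standard deviation) can be sketched as follows. This is a hedged illustration, not the code used in this study: the “running” fit is assumed here to be a sliding-window least-squares quadratic evaluated at the window centre, the window edges are simply trimmed, the series is assumed gap-free, and all names are hypothetical.

```python
import numpy as np

CADENCE_DAYS = 29.4 / (60.0 * 24.0)  # Kepler long-cadence sampling interval


def running_quadratic_weights(n_window):
    """Weights that return a least-squares quadratic's value at window centre."""
    half = n_window // 2
    x = np.arange(-half, half + 1, dtype=float) / max(half, 1)
    design = np.vander(x, 3, increasing=True)  # columns: 1, x, x^2
    # Row 0 of the pseudo-inverse yields the constant term, i.e. the fit at x = 0.
    return np.linalg.pinv(design)[0]


def long_timescale_cdpp(flux, poly_days=24.0, bin_days=3.25,
                        cadence_days=CADENCE_DAYS):
    """Std. dev. of bin averages after removing a running quadratic fit."""
    flux = np.asarray(flux, dtype=float)
    n_window = int(round(poly_days / cadence_days)) | 1  # force an odd length
    n_bin = int(round(bin_days / cadence_days))
    half = n_window // 2
    weights = running_quadratic_weights(n_window)  # symmetric in x
    smooth = np.convolve(flux, weights, mode="valid")
    resid = flux[half: flux.size - half] - smooth  # edges trimmed
    n_bins = resid.size // n_bin
    binned = resid[: n_bins * n_bin].reshape(n_bins, n_bin).mean(axis=1)
    return float(np.std(binned))
```

Under these assumptions the defaults reproduce the ×12 scaling of the standard metric; passing `poly_days=2.0` and `bin_days=6.5 / 24.0` would give the analogous short timescale quantity. A pure quadratic trend returns a value near zero, while a 10 day sinusoid passes through largely intact.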

Figure 6. This plot shows what fraction of existing signals in the
form of injected test sinusoids at amplitudes corresponding to one
standard deviation of the underlying time series are preserved by
data release 8.0 (‘regularMAP’ - upper panel), and data releases
8.3 and 9.2 (‘msMAP’ - lower panel). Signal corruption of unity
corresponds to total loss of signal, while small values indicate high
fidelity retention of input signals through the systematics removal
step. The corruption metric is qualitatively the same as fractional
damping of the signal amplitude. Clearly, for the current ‘msMAP’,
signals with periods >20 days are severely damped. Not shown
are more subtle, and less well characterized dependencies on signal
amplitude. Larger input signals show relatively better preservation
at long periods, while smaller amplitudes show more damping.

With this much longer timescale metric, and consideration
of the same stellar sample used in Paper 1, some previously
relevant noise terms are now unimportant. At 6.5
hours for CDPP the Poisson noise was roughly comparable
to the intrinsic stellar term. At the ×12 longer metric
the intrinsic stellar term rises due to better sampling of the
primary timescales of stellar activity, while the Poisson term
drops by √12. Factors from readout noise on the CCDs,
Poisson fluctuations on the counts, and sky background
are now unimportant.
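
The √12 drop in the Poisson term is the usual behaviour of uncorrelated noise under averaging, and can be checked numerically. The sketch below uses Gaussian white noise as a stand-in for Poisson noise and assumes 13 samples per 6.5 hour bin (a 30 minute cadence); both choices, and the function names, are assumptions made for illustration.

```python
import numpy as np


def binned_rms(series, n_per_bin):
    """Standard deviation of consecutive block averages."""
    series = np.asarray(series, dtype=float)
    n_bins = series.size // n_per_bin
    blocks = series[: n_bins * n_per_bin].reshape(n_bins, n_per_bin)
    return float(blocks.mean(axis=1).std())


def poisson_drop_factor(n_short=13, factor=12, n_long_bins=2000, seed=0):
    """Ratio of white-noise rms at the short binning to that at the long binning."""
    rng = np.random.default_rng(seed)
    white = rng.normal(0.0, 100.0, size=n_short * factor * n_long_bins)
    return binned_rms(white, n_short) / binned_rms(white, n_short * factor)
```

Calling `poisson_drop_factor()` should return a value close to √12 ≈ 3.46, within the scatter expected from a finite number of bins. Stellar activity signals are correlated over days, so they do not average down in the same way, which is why the stellar term dominates at the longer timescale.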

We have attempted to pursue the same type of Singular
Value Decomposition to isolate noise terms associated
with individual quarters, the stars themselves and contributions
from the instrument. This has been relatively
unsuccessful. The original CDPP noise separation leveraged
off isolating nearly comparable terms, and benefited
from a relatively narrow range of intrinsic stellar noise.
The longer timescale CDPP encounters much more discrepant
components in which the instrumental (or software
inadequacy in dealing with these) terms are small
compared to intrinsic stellar, and more importantly the
stars show a broader distribution of intrinsic noise. We
therefore concentrate on showing direct evaluations of
the longer timescale CDPP for the Kepler stars, recognizing
that if anything these will be over-estimates of
the intrinsic stellar noise. We compute the solar metric
using the same algorithms and codes used for the stars.
The somewhat surprising results are shown in Figure 7.
Panels are included for analysis of both the simple aperture
photometry (raw), and the calibrated data (release
9.2; the version for release 8.3 is identical for all intents) for
which the distribution of stellar values (median for the
17 quarters is adopted for each star) is shown in relation
to statistics on the corresponding solar values. Note that
there is a strong cluster in the calibrated data to the low
range of solar variability. Even with consideration of the
raw-data time series for which no instrumental systematics
have been removed the mode for the stars is well
below the mean for the Sun.

Figure 7. The upper panel shows a histogram of number of stars
(of 4,529 total) at different levels of the long timescale CDPP, labelled
as stellar noise in parts per million based on the calibrated
time series. The lower panel shows the same based on use of the
direct, or raw data uncorrected for systematics. The mean and
rms distribution for solar noise levels over quarter-long intervals
spanning a solar cycle are shown by the “+” and heavy horizontal
line, with the full extent of solar noise per quarter the thin line.

Great effort has been expended in an attempt to make 
the primary feature (cluster of stellar values to low range 
of solar) go away. While one can never be certain of any 
result, we have been unable to resolve this finding. We 
have verified that our solar record in use is reasonable 
by comparing as in Figure 4 to an independent compilation.
We have verified that the same code is used for
the Sun and stars to form the CDPP. We have verified 
that the stellar and solar time series are normalized in 
the same way. Something that could explain the upper 
panel of Figure 7 would be significant suppression of stellar
signal within our 8-15 day passband already by the
Kepler pipeline processing. To pursue this we selected 
a subset of stars having CDPP near the mode of 100 
ppm in the bottom panel, that also fell near the much 
smaller mode near 20 ppm in the upper panel. We then 
visually inspected this subset looking for signals of in¬ 
termediate frequency (8 - 15 days) in the raw data that 
might have been improperly removed in creating the cal¬ 
ibrated data. While some intermediate frequencies could 
be seen in the raw data cases, these invariably seemed 
to be common mode variations across the several cases 
examined, and almost certainly not inherent stellar sig¬ 
nals. Although expecting that this (pipeline suppression 
of real stellar signals) was the most logical explanation 
for the distribution in the calibrated data of Figure 7,
we have been unable to find evidence in support of this
contention. Indeed, having eliminated all potential contenders
considered for an explanation we are left with
accepting the seemingly improbable one that for these 
long timescales there is a large subset of the stars having 
activity levels near the minimum recently experienced by 
the Sun. However, the comparison shown in the upper 
panel of Figure 7 is also misleading in over-emphasizing a
quiet distribution for the stars. To higher CDPP values 
there is a very long tail not shown in the figure. In¬ 
deed the number of stars with CDPP greater than the 
highest encountered by the Sun is 802, while the num¬ 
ber quieter than the lowest solar value is only 322. The 
mean over all stars is 352 ppm, while the solar mean is
151 ppm. The medians switch to 99 ppm for the stars
and 163 ppm for the Sun. Recall, though, that we have
not made SVD-based adjustments for other non-stellar 
contributions to the CDPP. Although we believe such 
corrections would be minor at this long timescale, doing 
so would not change large values, but could shift some 
of the smaller (at < 30 ppm) stellar values to yet lower 
values. 
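
The comparison above (counts outside the solar range, and a mean pulled well above the median by a long tail) can be sketched as below. The stellar values are the five calibrated ×12 CDPP entries listed in Table 3; the solar values are HYPOTHETICAL placeholders chosen only to make the example run, not measurements from this study, and the function name is likewise an assumption.

```python
import numpy as np


def summarize_against_sun(stellar_ppm, solar_ppm):
    """Counts of stars outside the solar range, plus mean and median."""
    stellar = np.asarray(stellar_ppm, dtype=float)
    solar = np.asarray(solar_ppm, dtype=float)
    return {
        "n_above_solar_max": int(np.sum(stellar > solar.max())),
        "n_below_solar_min": int(np.sum(stellar < solar.min())),
        "stellar_mean": float(stellar.mean()),
        "stellar_median": float(np.median(stellar)),
    }


# Calibrated x12 CDPP values (ppm) for the five stars shown in Table 3.
table3_cal = [22.45, 5722.35, 105.55, 22.97, 947.07]
# HYPOTHETICAL per-quarter solar values (ppm); placeholders, not real data.
solar_demo = [60.0, 120.0, 180.0, 240.0]

summary = summarize_against_sun(table3_cal, solar_demo)
```

Even in this five-star excerpt the mean (about 1364 ppm) sits far above the median (105.55 ppm) because of two very variable stars, the same effect that separates the mean and median of the full sample.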

Table 3 shows the first five lines for the electronically 
available table documenting primary results in this pa¬ 
per. A total of 4529 stars brighter than Kp = 12.5 met 
the selection criteria for solar-type dwarfs as detailed in 
Paper 1. For each of these stars Table 3 provides the 
Kp value, the standard 6.5 hour CDPP analog, and the 
inferred intrinsic stellar noise for this based on analy¬ 
sis of all Kepler quarters and the latest data release as 
discussed in Section 3. Also provided are the 3.25 day 
CDPP raw and calibrated data values as summarized in 
Figure 7 of this section.

4.2. Is the Kepler Dwarf Sample More or Less Noisy 
than the Sun? 

The title of this subsection is a seemingly simple ques¬ 
tion. The perhaps best simple answer would be: It de¬ 
pends. 

Previous, careful and reasonable studies that addressed 
this question came up with conflicting answers. BWR13 
and earlier studies sided with the Kepler stars being
at least as quiet as the Sun, while Paper 1 and
McQuillan, Aigrain & Roberts (2012) sided with the Kepler
stars on average being a bit more active than solar.

For the long timescale CDPP detailed in this section, 
one measure is that more solar-type stars (giants have 
been excluded) have variations at a level higher than the 
most active Sun, than those having variations at a level 
lower than the least active Sun. However, the mode for 
the stellar distribution of activity levels is distinctly to¬ 
ward the quiet end of the solar range of variability. This 
latter feature persists were we to adopt our version of 
the 8-day MDV metric. This feature would also persist 
were we to adopt a long timescale CDPP metric at half 
the timescale, i.e. with a primary response function of 4 
- 7.5 days. 

Robust removal of instrumental signatures without in 
some cases suppressing real stellar signatures is undeni¬ 
ably a difficult problem. Careful inspection of many raw 
and calibrated time series suggests that there is no ob¬ 
vious issue with the pipeline being too aggressive and 
suppressing solar-type star intrinsic variations, although 
we cannot fully rule this out as a factor contributing to 
the stellar distribution. 

Table 3

Standard and long timescale CDPP values.

KIC       Kp       CDPP     Stellar Noise   ×12 CDPP(raw)   ×12 CDPP(cal)
1025494   11.822    24.85    15.40            154.13           22.45
1025986   10.150   119.56   118.46           5704.62         5722.35
1026669   12.304    28.21    17.19            244.60          105.55
1027030   12.344    30.40    20.27            184.49           22.97
1162051   12.475    53.24    52.81            982.96          947.07

Note. All noise values are in ppm. The standard timescale values for our CDPP analog and inferred intrinsic stellar noise are
discussed in Section 3. The longer timescale CDPP values corresponding to analysis of both raw and calibrated time series are discussed
in Section 4. Full version of table is available online.

Therefore, perhaps the best answer to settle on is: We
don’t know, it depends. There probably is not a good,
robust answer to the simple question posed in this subsection.
What does seem quite clear, though, in adopting
an answer is that the distribution of solar variability
experienced over a recent solar cycle is well within the
range that a large number of Kepler stars show on average.
The Sun is typical in the Kepler distribution, which
is quite wide with a non-simple structure.

5. SIMULATIONS OF STELLAR NOISE 

In Paper 1 we included extensive discussion of using 
the galactic population synthesis package TRILEGAL
(Girardi et al. 2000) to provide a simulated set of stars
appropriate for the Kepler field of view. This was followed
by detailed discussion of granulation and stellar
activity contributions, the two of which were modelled as
a function of the TRILEGAL generated stellar parameters
(mass and age). Normalization was accomplished
for the activity contribution through consideration of
both ground based studies as presented in Radick et al.
(1998), Lockwood et al. (2007), and Hall et al. (2009),
as well as reference to solar variations as measured by
SOHO (Fröhlich et al. 1997).

With both the simulated stellar parameters and codes 
available from the Paper 1 study, we have made only one
change: adoption of the transfer function shown in Figure 5
for our ×12 longer timescale CDPP metric. Since
for this much longer timescale metric we expect the stel¬ 
lar contributions at 12th magnitude to generally domi¬ 
nate over Poisson, readout noise and instrumental terms 
we have provided only the stellar terms from the simula¬ 
tion. 

Figure 8 shows the resulting distribution of simulated
stellar noise at the 3.25 day CDPP timescale considered
in the previous section. The agreement with observations
as shown in Figure 7 is generally quite good. Stars with
parameters close to solar map into mid-range of the so¬ 
lar variation as measured directly from the SOHO data. 
Most importantly the strong peak at low noise levels - es¬ 
sentially a pile-up near the lower range of solar variability 
levels experienced over a solar cycle, is reproduced in the 
simulations. Since the simulation codes were not tuned 
to reproduce the distribution seen in the real data of Figure 7,
we take the general agreement as confirmation that
the distribution of noise seen in the real Kepler data is 
a reasonable representation of reality. The consistency 
between real and simulated data further demonstrates 
that the pipeline is not significantly suppressing stellar 
signals in our bandpass. 

[Figure 8 plot: panel title Kp = 11.5 to 12.5; horizontal axis is Stellar noise (ppm).]

Figure 8. Simulated distribution of stellar noise arising from activity
and granulation for stars with Kp = 11.5 - 12.5 as modelled
in Paper 1, with adoption of the transfer function of Figure 5
appropriate to the 3.25 day CDPP metric defined in Section 4.

The population of stars in Figure 7 at CDPP values
less than 70 ppm presumably arises from two factors.
The fraction of all stars sampled falling below 70 ppm is
38%. The first factor is that ~20% of the time the Sun
is this quiet. The second factor is that 20% of the stars
in the simulations of Figure 5 have ages greater than 5
Gyr. Thus the very quiet stars sampled by Kepler may
arise equally from stars similar to the Sun, and in quiet
phases of activity cycles, and from stars inherently older
than solar.

6. SUMMARY 

We have repeated an earlier analysis studying noise in 
Kepler data at timescales relevant to the detection of 
exoplanet transits using much longer time intervals, and 
making use of more recent data products. The inferred 
intrinsic stellar noise stayed fixed with adoption of more, 
and newer data, thus providing confidence in the analyses.
The inferred residual noise arising from the instrument
dropped with the consideration of newer data products,
as would be expected since the software
updates had been intended to do this. Residual noise as
a function of time during the Kepler mission correlates 
well with solar activity. The earlier study by us (Paper 
1) had shown a strong correlation between excess noise 
by-channel with the mean focus offset of the channels 
(in the sense that fuzzier images had poorer photome¬ 
try). That correlation is still present considering all of 
the data, and the most recent data release, but is now 
relatively weak, consistent with most possible gains in 
suppressing instrumental noise now being in hand. 

We have explored a longer timescale metric better
suited to elucidating levels of stellar magnetic activity
induced variations. This has shown mixed results. We
find that the spread of solar variations over a recent cycle
is well within the spread of mean noise levels for a large
sample of solar-type stars. We also find that there is a
strong concentration of Kepler noise levels near the minimum
values reached by the Sun. We have not been able
to find evidence in support of any conclusion for this, except
the one directly presented: there seem to be many
Kepler solar-type stars that are as quiet as the quiet
Sun, more than we expected based on either the earlier
(Paper 1) study using a metric less well suited to characterizing
stars, or on modelling of expected noise levels
using galactic population synthesis models (Robin et al.
2003, TRILEGAL) suggesting an age distribution averaging
younger than the Sun. A direct simulation for
the long timescale does, however, show results consistent
with the observations. A significant fraction of stars are
older, and hence quieter than the Sun even though as
argued in Paper 1 the overall age distribution is younger
than solar. As such this study shows that the Sun may
be considered typical of the Kepler distribution of solar-type
star activity levels. The significant fraction of stars
with activity levels at or below the quiet Sun is a generally
positive result for habitability (See et al. 2014) of
potential Earth-analogs in the Kepler field. A simple
answer to the question of whether the Sun is quieter or
noisier than the Kepler sample has not been reached.